Networked audio output device in an audio video distribution system

ABSTRACT

This disclosure describes a networked audio output device in an audio video distribution system that uses a local area network that includes a network speaker node; and an audio output device that transmits the analog audio signal and that couples to the network speaker node through the speaker/microphone driver. The network speaker node further includes a controller with a network interface that couples to the local area network, where the controller further comprises an embedded controller with memory and which is programmed to function as a web server. The network speaker node further includes a digital signal processor that couples to the controller. And, the network speaker node further includes a speaker/microphone driver that couples to the digital signal processor.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefits of the earlier filed U.S. Provisional Application Ser. No. 60/379,313, filed 9 May 2002, which is incorporated by reference for all purposes into this specification.

Additionally, this application is a National Stage application of International Patent Application PCT/US2003/014603, filed 8 May 2003, which is incorporated by reference for all purposes into this specification.

Additionally, this application is a continuation of U.S. patent application Ser. No. 10/513,737, filed 4 Nov. 2004, which is incorporated by reference for all purposes into this specification.

Additionally, this application is a continuation of U.S. patent application Ser. No. 11/467,340, filed 25 Aug. 2006, which is incorporated by reference for all purposes into this specification.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to audio video distribution systems. More specifically, the present invention relates to networked audio output devices in an audio video distribution system.

2. Description of the Related Art

Currently, most audio speakers are passive devices that receive an analog or digital audio signal. A few advanced models have limited self-diagnostics that can be communicated out over additional wire runs as well. These speakers are usually wired to racks or source switching pre-amps and amplifiers. The problem with this approach is that these systems are not very flexible. It is hard to expand the audio sources that can be heard through the speakers embedded in walls or other places after the system has been installed without buying and installing additional costly components. Other audio sources, such as home control system voice communication, intercom audio, soundtracks for CD-ROM games, solid-state sound memories, digital audio broadcasting systems, and even Internet audio, cannot easily be added and routed through to the existing speakers at a future date if the existing system was not originally designed to input and handle them. This is mostly due to the ongoing proliferation of new audio compression formats. High-quality digital audio data takes a lot of hard disk space to store (or channel bandwidth to transmit). Because of this, many companies have worked on compressing and/or coding the bit stream to allow for a smaller binary footprint. This allows high quality music to take up less storage space and to be transported across vast networks with a smaller amount of data, and therefore less bandwidth. However, these new compression and encoding formats require that decompression and decoding be performed to reconstitute the original audio before it is played out the loudspeaker. If an existing audio system is limited to reconstituting only audio formats known at the time of installation, the audio system quickly becomes obsolete.

Many new products have wireless network capabilities, but still cannot be easily connected into a home network, because of a lack of easily accessible wireless to wired network bridging within range of the device. This can especially be a problem if the wireless device is a handheld mobile unit such as a PDA, and due to a lack of access points, cannot communicate from all rooms in the house.

The current approach to controlling audio and doing home automation is often cumbersome. The sound system remote that allows the room audio level to be adjusted does not allow the room lights to be dimmed. Therefore, different remote controllers for each function are needed. Nor do users like the “wall clutter” created by putting separate multiple audio and other home network control units in the walls. Wireless solutions to this problem such as Radio Frequency, known as RF, or Infra-Red, called IR, have limitations. The biggest limitation for RF is that in many large cities, the RF noise background is very high, creating communication problems, and there may be health concerns with excessive RF. The IR limitation is that IR is effective in “line of sight” only, and the home automation devices to be controlled may be in other rooms. These problems are compounded in retrofit situations where minimal changes that affect the current building and existing systems are desired.

It is therefore the object of this invention to provide a networked speaker, so that an audio distribution system can be created that is integrated with the home automation system into a home network that permits vocal feedback, status, and even control with the audio through the network speakers. The network should let the user know what is happening, and provide very intuitive instruction on how to use the system. This will enable the audio speakers to easily adjust to and allow new audio sources and to become wireless access points in the home, or provide the wireless bridge to the hard-wired network.

SUMMARY OF THE INVENTION

This disclosure describes a networked audio output device in an audio video distribution system that uses a local area network. The networked audio output device includes a network speaker node; and an audio output device that transmits the analog audio signal and that couples to the network speaker node through the speaker/microphone driver. The network speaker node further includes a controller with a network interface that couples to the local area network and controls the processing of the digital audio signal, where the controller further comprises an embedded controller with memory and which is programmed to function as a web server. The network speaker node further includes a digital signal processor that couples to the controller, where the digital signal processor processes and formats the digital audio signal and the analog audio signal and converts the digital audio signal to and from the analog audio signal. And, the network speaker node further includes a speaker/microphone driver that couples to the digital signal processor and provides the external connection for the analog audio signal.

The networked audio output device further provides that the audio output device may include headphones or include one or more speakers.

Additionally, the networked audio output device further includes speaker sensors that couple to the digital signal processor and provide feedback and allow for sending control signals back to other devices in the local area network.

DESCRIPTION OF THE DRAWINGS

To further aid in understanding the invention, the attached drawings help illustrate specific features of the invention and the following is a brief description of the attached drawings:

FIG. 1 is a block diagram of an audio distribution system.

FIG. 2 is a block diagram of a network speaker embodiment of the system shown in FIG. 1.

FIG. 3 is a block diagram of another network speaker embodiment.

FIG. 4 is a block diagram of another network speaker embodiment.

FIG. 5 is a block diagram of another network speaker embodiment.

FIG. 6 is a block diagram of another network speaker embodiment.

FIG. 7 is a block diagram of the internal components of a network speaker embodiment.

FIG. 8 is a block diagram of a Legacy Audio Converter/Controller for use in the system shown in FIG. 1.

FIG. 9 is a block diagram of a network speaker including power options.

FIG. 10 is a network speaker including battery powered options and an energy storage module.

DETAILED DESCRIPTION OF THE INVENTION

An audio distribution network system 20 (FIG. 1) includes a plurality of speaker node units 100 which are coupled to a Transmission Control Protocol/Internet Protocol (TCP/IP) based network backbone 200. Also coupled to the network backbone 200 are networked audio source node devices 300, an Internet service interface 400, and a Legacy converter/controller 600. Legacy sources 500 provide analog or digital linear PCM (Pulse Code Modulation) audio to be converted into a packet switched digital coding for transport across the network. They will also provide analog video which will be used for control status feedback, as well as conversion to a packet switched digital coding for transport across the network. In addition, the Legacy sources 500 will also receive IR or serial commands from the converter/controller 600 which also communicates with a Legacy home control network 700. Some legacy sources 500 may also provide serial communications to the converter/controller 600.

The networked audio source devices 300 can consist of any number of networked digital audio source devices (music playback devices) such as personal computers or audio servers that are able to communicate with one another over the shared TCP/IP network 200 and have the resources to serve digital audio files (WMA, MP3, Corona, etc.) to the network. Bit streamed audio (digital music, in the form of binary data that is sent in packets) from the Internet also may enter the system 20 from the Internet interface 400. The Legacy audio devices 500 (existing analog audio equipment, i.e. CD players, tape decks, VCR's) have their audio converted into a packet switched digital network format (WMA, MP3, Corona) by the Legacy Converter 600 or by the network speakers 100. The network speaker 100 can also real time encode sound received from its internal microphone or from reversing the transduction circuit from the speaker to perform the act of capturing sound waves present in the room, and then coding that sound and providing it for use on the network 20, including by use of differential masking for control purposes. Any new device that is able to send audio out on the network can serve as the audio source for a network speaker 100 as long as the network speaker 100 understands the audio format. Control commands that affect the audio distribution can come from the network control server 310, network audio source devices 300, the Internet interface 400, the legacy home control network 700 via the legacy converter/controller 600, or from other network speakers 100.
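The requirement that a network speaker 100 "understands the audio format" can be pictured as a lookup from format name to decoder. The following is a minimal sketch only, not the implementation described in this specification: the names FormatRegistry, register, understands, decode, and decode_raw_pcm8 are hypothetical, and the stand-in decoder simply treats each byte as one unsigned 8-bit PCM sample.

```python
from typing import Callable, Dict, List

# Hypothetical decoder signature: coded bytes in, linear PCM samples out.
Decoder = Callable[[bytes], List[int]]


class FormatRegistry:
    """Maps an audio format name to a decoder; formats can be added later."""

    def __init__(self) -> None:
        self._decoders: Dict[str, Decoder] = {}

    def register(self, fmt: str, decoder: Decoder) -> None:
        # Format names are compared case-insensitively.
        self._decoders[fmt.upper()] = decoder

    def understands(self, fmt: str) -> bool:
        return fmt.upper() in self._decoders

    def decode(self, fmt: str, payload: bytes) -> List[int]:
        if not self.understands(fmt):
            raise ValueError(f"unsupported audio format: {fmt}")
        return self._decoders[fmt.upper()](payload)


def decode_raw_pcm8(payload: bytes) -> List[int]:
    """Stand-in decoder: each byte is one unsigned 8-bit PCM sample."""
    return list(payload)
```

In this sketch, registering a decoder after installation is what lets a node accept a source whose format it previously rejected.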

The system 20 is a collection of independent computers or other intelligent devices that communicate with one another over the shared TCP/IP network 200. For example, the system 20 can be part of the Internet linked networks that are worldwide in scope and facilitate data communication services such as remote login, file transfer, electronic mail, the World Wide Web and newsgroups, or for security reasons part of a home intranet network utilizing Internet-type tools, but available only within that home. The home intranet is usually connected to the Internet via an Internet interface 400. Intranets are often referred to as LANs (Local Area Networks).

The home network backbone 200 communicates using the TCP/IP network protocol consisting of standards that allow network members to communicate. A protocol defines how computers and other intelligent devices will identify one another on a network, the form that the data should take in transit, and how this information is processed once it reaches its final destination. Protocols also define procedures for handling lost or damaged transmissions or “packets”. The TCP/IP network protocol is made up of layers of protocols, each building on the protocol layers below it. The basic layer is the physical layer protocol that defines how the data is physically sent through the physical communication medium, such as Thickwire, thin coax, unshielded twisted pair, fiber optic cable, telephone cable, RF, IR, power line wires, etc. Those physical media requiring an actual physical connection of some type to the network device, such as Thickwire, thin coax, unshielded twisted pair, fiber optic cable, power line, or telephone cable, are called wired media. Those physical media not requiring an actual physical wire connection of any type to the network device, such as RF and IR, are called wireless media. A TCP/IP home network can be totally wired, totally wireless, or a mix of wireless and wired. A TCP/IP home network is not limited to a single physical communication medium. Different physical communication media can be connected together by bridging components to create a unified communication network. Each network physical medium has its own physical layer protocol that defines the form that the data should take in transit on that particular physical medium. The bridging component enables the transfer and conversion of communication on one physical medium and its physical layer protocol to a different physical medium and its physical layer protocol.
Bridging components also may provide a proxy from one network to the other; this will be common among UPnP V1 to V2, and with IPv6 to IPv4 (Internet Protocol version 6, 4). Common physical layer LAN technologies in use today include Ethernet, Token Ring, Fast Ethernet, Fiber Distributed Data Interface (FDDI), Asynchronous Transfer Mode (ATM) and LocalTalk. Physical layer protocols that are very similar over slightly different physical media are sometimes referred to by the same name but of different type. An example is the three common types of Fast Ethernet: 100BASE-TX for use with level 5 UTP cable, 100BASE-FX for use with fiber-optic cable, and 100BASE-T4 which utilizes an extra two wires for use with level 3 UTP cable. The TCP/IP protocol layers are well known and will not be described in greater detail.

The system 20 may have any number of networked self-sufficient digital audio source devices 300 in it, such as a digital music storage device, PC, music player, Personal Digital Assistant (PDA), on board automobile music system, digital integrated audio equipment, personal digital recorder or video digital recorder. Networked audio source devices 300 can provide digital audio files such as WMA, MP3, “Corona”, and MLP from a hard disk, internal flash, or an audio input such as a microphone or CD reader or music player. The system 20 may also have any number of network control servers 310 that can encompass a specialized network server, usually a specialized, network-based hardware device designed to perform a single or specialized set of server functions. It is usually characterized by a minimal operating architecture, and client access that is independent of any operating system or proprietary protocol. Print servers, terminal servers, audio servers, control remote access servers and network time servers are examples of server devices which are specialized for particular functions. Often these types of servers have unique configuration attributes in hardware or software that help them to perform best in their particular arena. While specialized hardware devices are often used to perform these functions in large systems, the specialized functions served by the network server could be performed by a more general use computer. A single computer (sometimes a RISC (reduced instruction set computer) machine), called a web server, could combine the functionality of the networked audio source devices 300 and the Internet interface 400. If no connection to the Internet is desired, the Internet interface 400 function can be removed from the system without loss of intranet network integrity. Network and web servers are well known and will not be described in greater detail.

The legacy home control network 700 is an existing network of devices in the home used to automate and control the home. If the legacy home control network 700 cannot communicate over a shared TCP/IP network 200, it cannot directly control or be controlled by the network speakers, and the two dissimilar networks must be bridged by a Legacy Converter/Controller 600. Any legacy home control network 700 that can communicate within the system 20 over a shared TCP/IP network could be combined into the home network backbone 200 and then the legacy home control network 700 device would have access to and be able to control the network speaker 100 if it has the resources and instructions to do so. The Legacy Converter/Controller 600 can also be used to provide network access to un-networked legacy devices that are able to serve as command and control interfaces such as the telephone, cell phone, RF remote, IR remote, direct voice controller or keypad. A networked audio source 300 such as a PDA also can act as the legacy converter/controller for a legacy device such as an attached cell phone.

The legacy home audio sources 500 are other audio sources that are not able to communicate over a shared TCP/IP network 200, such as analog audio players, CD players, video game players, tape players, telephones, or VCRs. The Legacy Converter/Controller 600 takes the analog or digital linear PCM audio from the legacy home sources 500, converts it into an acceptable digital format or formats if needed, and serves the audio as needed over the shared TCP/IP home network backbone 200. If the legacy home audio source 500 provides analog audio to the Legacy Converter/Controller 600, the Legacy Converter/Controller 600 must convert the analog audio into an appropriate digital audio format before serving it to the network. The Legacy Converter/Controller 600 can also convert commands sent from the home network 200 to the legacy home source 500 into a command format that is understood by the legacy home source 500, such as serial, RF or IR commands. A system may have multiple Legacy Converter/Controllers 600 for each legacy home source 500 or legacy home control network 700, or a Legacy Converter/Controller 600 may convert and control more than one legacy home source 500 or multiple legacy home control networks 700.

Illustrated in FIG. 2 is one network speaker embodiment 100A. A network interface 110 couples the network backbone 200 of the system 20 (FIG. 1) to a network controller 120 which feeds a digital to analog converter (DAC) 122 via an audio format converter 121. Receiving an output from the DAC 122 is a pre-amplifier 123 which also receives inputs from speaker sensors 124. An amplifier 125 receives the output of the pre-amp 123 and feeds a speaker/microphone driver 126 coupled to speaker/microphone components 127.

The network speakers 100A may be enclosed in a case or box, in a ceiling, embedded in or behind a wall, or in a car, and constitute the most prevalent enabling components in the system 20. Each network speaker 100A communicates to the network backbone (Ethernet) 200 through the network interface 110 that handles the physical layer hardware protocol. The network interface 110 may connect to one or more physical layers, wired or unwired or both. From there the Network Speaker Controller 120 provides the intelligence to run the various application features of the network speaker, including the higher levels of the TCP/IP protocol. Audio sources (digital music content) received from the network and addressed to a particular network speaker 100A are sent to the audio format converter 121 that converts the source digital audio format into a form ready to be converted to analog. The correctly re-formatted digital signal is sent to the DAC 122 to be converted from digital to analog. The analog signal then goes to a pre-amp 123 where the signal is adjusted and filtered. Included in the pre-amp 123 can be an active crossover which operates at preamp level to limit the frequencies to the amplifier or amplifiers connected to it. The speaker components connected to these amplifiers would therefore receive a limited frequency range, and can be optimized for the frequencies received. The pre-amp signal then goes to the amplifier section 125, and the amplified signal proceeds to the speaker/microphone driver 126 and out the speaker/microphone components 127 to become audio sound waves. Because the application software in the Network Speaker controller 120 and audio format converter 121 can be updated over the network, given sufficient processing power and ample memory, the network speaker 100A can be made to play currently unknown digital formats in the future. The audio format converter 121 may have the DAC 122 built in.
The speaker sensors 124, which may include temperature, SPL (such as a baffle microphone), ambient and noise floor, pressure, and voltage sensors, provide the on board application speaker feedback which enables internal auto adjustment to enhance speaker protection and performance and allows for sending control signals back to other devices which may need/want the status information. A very useful application for this would be differential masking. This is a process in which samples from the digital source are compared against the real time encoding samples from within the air space. The original digital source is then subtracted from the combined real time encoding and the result is a new sample.
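The subtraction step of differential masking can be sketched as follows. This is an illustrative sketch only: the function name differential_mask is hypothetical, and it assumes the source and captured samples are already time-aligned, at the same sample rate, and at the same gain, which a real acoustic path would not guarantee.

```python
from typing import List, Sequence


def differential_mask(source: Sequence[float],
                      captured: Sequence[float]) -> List[float]:
    """Subtract the known source samples from the captured room samples.

    Assumes both sequences are time-aligned and gain-matched; what is left
    is whatever the room added on top of the played source.
    """
    if len(source) != len(captured):
        raise ValueError("source and captured must have the same length")
    return [c - s for s, c in zip(source, captured)]
```

For example, if the captured samples are the played source plus a spoken command, the returned samples approximate the spoken command alone, which could then be passed on for control purposes.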

The network interface 110 connects the network speaker 100A to the actual network backbone 200 and will vary depending on the physical media and physical layer protocol used. Network interface cards, commonly referred to as NICs, are often used to connect PCs to a wired network, and are used in the preferred embodiment when the network backbone is some form of wired cable or fiber optics. The NIC provides a physical connection between the networking cable and the computer's internal bus. Different computers have different bus architectures; the most common are PCI slots found on 486/Pentium PCs and ISA expansion slots commonly found on 386 and older PCs. NICs come in three basic varieties: 8-bit, 16-bit, and 32-bit. The larger the number of bits that can be transferred to the NIC, the faster the NIC can transfer data to the network cable. Many NIC adapters comply with Plug-n-Play specifications. On these systems, NICs are automatically configured without user intervention, while on non-Plug-n-Play systems, configuration is done manually through a setup program and/or DIP switches. Cards are available to support almost all networking standards, including the latest Fast Ethernet environment. Fast Ethernet NICs are often 10/100 capable, and will automatically set to the appropriate speed. Full duplex networking is another option, where a dedicated connection to a switch allows a NIC to send and receive at the same time, effectively doubling the available bandwidth. NICs with multiple terminations capable of supporting multiple physical layer protocols, or multiple types within a protocol, are to be preferred. Within the NICs are transceivers used to connect nodes to the various Ethernet media. Most computers and network interface cards contain a built-in 10BASE-T or 10BASE2 transceiver, allowing them to be connected directly to Ethernet without requiring an external transceiver. Many Ethernet devices provide an AUI connector to allow the user to connect to any media type via an external transceiver.
The AUI connector consists of a 15-pin D-shell type connector, female on the computer side, male on the transceiver side. Thickwire (10BASE5) cables also use transceivers to allow connections. For Fast Ethernet networks, a new interface called the MII (Media Independent Interface) was developed to offer a flexible way to support 100 Mbps connections. The MII is a popular way to connect 100BASE-FX links to copper-based Fast Ethernet devices. Wireless backbone physical layer network connections are made using RF network receivers made by companies such as Linksys, Cisco, IBM, DLINK and others, using wireless protocols such as 802.11x, UWB (ultra wideband), Bluetooth, and more as the network interface 110.

The network speaker controller 120 is an embedded controller with flash memory programmed to function as a web server. Because the network speaker controller 120 and the audio format converter 121 allow their application programming to be updated over the network, the network speaker can be made to play currently unknown digital formats in the future. The audio sources received from the network most likely will be in an encoded and/or compressed format. Digital audio coding or digital audio compression is the art of minimizing storage space (or channel bandwidth) requirements for audio data. Modern perceptual audio coding protocols, synonymously called digital audio compression techniques, like MPEG Layer-III or MPEG-2 AAC, ATRAC3, WMA, Ogg Vorbis, or “Corona”, and even a packet switched Dolby Digital (AC3 over IPv6), exploit the properties of the human ear (the perception of sound) to achieve a respectable size reduction with little or no perceptible loss of quality. This compression is usually more than just reducing the sampling rate and the resolution of the samples. Basically, this is realized by perceptual coding techniques addressing the perception of sound waves by the human ear, which remove the redundant and irrelevant parts of the sound signal. The sensitivity of the human auditory system for audio signals varies in the frequency domain, being high for frequencies between 2.5 and 5 kHz and decreasing beyond and below that frequency band. The sensitivity is represented by the Threshold In Quiet, so that any tone below the threshold will not be perceived. The most important psychoacoustic fact is the masking effect of spectral sound elements in an audio signal like tones and noise. For every tone in the audio signal a masking threshold can be calculated. If another tone lies below this masking threshold, it will be masked by the louder tone and remains inaudible, too.
These inaudible elements of an audio signal are irrelevant for the human perception and thus can be eliminated by the encoder. The result after encoding and decoding is different from the original, but it will sound more or less the same to the human ear. How closely it would sound to the original depends on how much compression had been performed on it.
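The Threshold In Quiet can be illustrated numerically. The sketch below uses a widely cited closed-form approximation of the absolute hearing threshold (commonly attributed to Terhardt); this formula is not given in this specification and is assumed here purely for illustration, as are the function names threshold_in_quiet_db and is_audible.

```python
import math


def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate absolute hearing threshold in dB SPL at freq_hz.

    Uses a commonly cited approximation; treat the values as indicative.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def is_audible(freq_hz: float, level_db_spl: float) -> bool:
    """A tone in quiet is treated as audible only above the threshold."""
    return level_db_spl > threshold_in_quiet_db(freq_hz)
```

Under this approximation the threshold is lowest near 3.3 kHz and rises toward both ends of the audible range, so a quiet tone that is audible in the 2.5 to 5 kHz band may fall below the threshold at 100 Hz and be a candidate for removal by an encoder.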

Audio compression really consists of two parts. The first part, called encoding, transforms the digital audio data that resides, say, in a WAVE file, into a highly compressed form called bitstream (or coded audio data). To play the bitstream on your soundcard, you need the second part, called decoding. Decoding takes the bitstream and reconstructs it to a WAVE file. Highest coding efficiency is achieved with algorithms exploiting signal redundancies and irrelevancies in the frequency domain based on a model of the human auditory system. Current coders use the same basic structure. The coding scheme can be described as “perceptual noise shaping” or “perceptual sub-band/transform coding”. The encoder analyzes the spectral components of the audio signal by calculating a filterbank (transform) and applies a psychoacoustics model to estimate the just noticeable noise-level. In its quantization and coding stage, the encoder tries to allocate the available number of data bits in a way to meet both the bit rate and masking requirements. The decoder is much less complex. Its only task is to synthesize an audio signal out of the coded spectral components.
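The bit allocation step described above can be sketched with a simple greedy loop. This is a textbook-style illustration, not the allocation used by any particular CODEC named in this specification. The function name allocate_bits and its parameters are hypothetical, the per-band signal-to-mask ratios are assumed to come from a psychoacoustic model, and each added bit is assumed to lower quantization noise by about 6.02 dB.

```python
from typing import List


def allocate_bits(smr_db: List[float], bit_budget: int,
                  db_per_bit: float = 6.02) -> List[int]:
    """Greedily give bits to the band whose noise is furthest above its mask.

    smr_db holds one signal-to-mask ratio per band. With zero bits, the
    noise-to-mask ratio (NMR) of a band is taken to equal its SMR.
    """
    if not smr_db:
        return []
    bits = [0] * len(smr_db)
    nmr = list(smr_db)
    for _ in range(bit_budget):
        worst = max(range(len(nmr)), key=lambda i: nmr[i])
        if nmr[worst] <= 0:
            break  # all quantization noise is at or below the masking threshold
        bits[worst] += 1
        nmr[worst] -= db_per_bit
    return bits
```

With a generous budget the loop stops once every band's noise is masked; with a tight budget the bits go first to the bands where noise would be most audible, which is the trade-off between bit rate and masking requirements.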

The term psychoacoustics describes the characteristics of the human auditory system on which modern audio coding technology is based. For the audio quality of a coded and decoded audio signal, the quality of the psychoacoustics model used by an audio encoder is of prime importance. Audio data decompression and decoding of audio formats into the audio format acceptable to the loudspeaker is performed by the audio format converter 121. This audio format conversion of different formats allows high quality low bit-rate applications, like soundtracks for CD-ROM games, solid-state sound memories, Internet audio, or digital audio broadcasting systems to all be played over the same speaker. The audio format converter 121 function in the current embodiment of the networked speaker will be performed by an audio coding and decoding chip set (CODEC). CODEC hardware and/or software is currently available from such companies as Micronas, Sigmatel, TI, Cirrus, Motorola, Fraunhofer, and Microsoft. CODECs handle the many current encoding protocols such as WMA, MPEG-2 AAC, MP3 (MPEG Layer III), MPSPro, G2, ATRAC3, MP3PRO, “Corona” (WMAPro), Ogg-Vorbis and others. To best perform the audio format conversion function, the CODEC should be designed to handle all types of audio content, from speech-only audio recorded with a low sampling rate to high-quality stereo music. The CODEC should be very resistant to degradation due to packet loss, and have efficient encoding algorithms to perform fast encodes and decodes, and to minimize the size of the compressed audio files, and still produce quality sound when they are decoded.
In addition, the CODEC should be highly scalable and provide high-quality mono or stereo audio content over a wide range of bandwidths, to allow selection of the best combination of bandwidth and sampling rate for the particular content being played or recorded. Content encoded at 192 Kbps by the CODEC should be virtually indistinguishable to a human ear from content originating on a compact disc. This extremely high-quality content is called CD transparency. A preferred embodiment of this invention uses the Windows Media Audio (WMA) Audio CODEC by Microsoft. The audio format converter 121 function could also be performed by a decoder chip with no encoder functionality if no digital audio reformatting or digital encoding of analog audio is desired.

The digital to analog converter 122 converts a digital input into an analog level output. At the pre-amp 123, the analog signal is adjusted and filtered, and any desired active or electronic crossover may be performed. An electronic crossover is a powered electronic circuit which limits or divides frequencies. Most electronic crossovers have output controls for each individual channel. This allows you to set the gains for all amplifiers at one convenient location, as well as the ability to level match a system. Some crossovers will allow you to set the low and high pass filters separately, which allows you to tune out acoustic peaks or valleys at or near the crossover frequencies. One of the advantages of electronic crossovers is that there is little or no insertion loss. Passive crossovers reduce the amplifier power slightly, due to their resistance. Another advantage of electronic crossovers is the ability to separate low frequencies into their own exclusive amplifier, which reduces distortion heard at high volumes in the high frequency speakers. Amplification of low frequencies requires greater power than higher frequencies. When an amplifier is at or near peak output, clipping may occur, which is able to destroy tweeters and other speakers with small voice coils. A separate low frequency amplifier allows the total system to play louder and with lower distortion. An adjustable crossover allows the user to make crossover changes easily and to immediately hear the effect of the changes. Changing the filters, or crossover points, lets users adjust the audio to meet their preferences. The electronic crossover, by limiting the frequencies to the amplifier or amplifiers connected to it, also ensures that the speakers which are connected to these amplifier(s) receive a limited frequency range, and these speakers can be optimized for the frequencies received. It also enables personal preferences in frequency range pre-amplification adjustment.
An advantage of using active filters is that they are built onto the pre-amp circuit board. Changing the filters (or crossover points) is usually accomplished through external dial turning, by changing frequency modules with a switch, or by changing crossovers if fixed types are used. An adjustable crossover is preferred.
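The frequency-dividing behavior of a crossover can be sketched digitally. The following is an illustrative sketch under stated assumptions, not the analog pre-amp circuit described above: the function name two_way_crossover is hypothetical, and it uses a simple first-order (one-pole) low-pass filter with a complementary high-pass output, whereas practical crossovers typically use steeper filters.

```python
import math
from typing import List, Sequence, Tuple


def two_way_crossover(samples: Sequence[float], sample_rate_hz: float,
                      crossover_hz: float) -> Tuple[List[float], List[float]]:
    """Split samples into a low band and a high band around crossover_hz."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * crossover_hz)
    alpha = dt / (rc + dt)  # smoothing factor of the one-pole low-pass
    low: List[float] = []
    high: List[float] = []
    state = 0.0
    for x in samples:
        state += alpha * (x - state)  # low-pass output
        low.append(state)
        high.append(x - state)  # complementary high-pass output
    return low, high
```

Because the high band is formed as the input minus the low band, the two outputs sum back to the original signal, and changing crossover_hz moves the crossover point in the way an adjustable crossover would.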

The amplifier 125 is comprised of one or more amplifier circuits that amplify the audio signal to the desired final signal strength. Using multiple amplifiers takes advantage of the crossover frequency filtering to optimize the amplifier for the frequency range received. Amplifiers using the latest in digital amplifier technology that can efficiently produce large amounts of power with a much smaller heat sink than in past designs are preferable, and this also will eliminate the need for another DAC. The speaker/microphone driver 126 is comprised of one or more speaker driver circuits. Using multiple drivers for multiple speakers allows the speakers to be optimized for the frequency range received. The speaker/microphone components 127 convert the signal to sound and are voiced and designed to handle a wide dynamic range of audio frequencies and are able to aid in the accurate reproduction of sound from a digital source.

FIG. 3 shows another network speaker embodiment 100B. The speaker embodiment 100B includes all of the components of the speaker embodiment 100A and identical components bear the same reference numerals. In addition, speaker embodiment 100B includes an analog to digital converter (ADC) 128 and a modified speaker/microphone driver 126 b. The speaker/microphone driver 126 b circuitry is expanded to serve as both an output driver and a microphone input for half duplex operation (one way conversations), and an internal microphone can implement a full duplex operation (simultaneous two way conversations). The microphone input is sent to the pre-amp 123 for signal adjustment and filtering. From there it is sent to the analog to digital converter 128 to convert the analog signal to a simple digital format. The audio format converter 121 then takes the digital microphone input and compresses and encodes it into a desired format for distribution. The encoded input, the format of which may vary depending on the application, is sent to the network controller 120 where, depending on the software application and programming, its final destination and function are determined. The input may be stored locally for future audio feedback, used locally, or it may be sent out to the network through the network interface 110. The input could be used with a voice recognition application to initiate spoken audio or home control commands. Feedback from the speaker sensors 124 received by the pre-amp 123 can also be sent to the ADC 128 to be converted from analog to digital format, and then passed on to the network controller 120. Depending on the network controller 120 applications, the feedback can then be sent out through the network interface 110 onto the network backbone 200 as an alarm or other condition if desired. Additional features in the audio format converter 121 in conjunction with application software could enable the ability to change audio setting(s) based on the type of music that is being played, or even the user playing it, or Time of Day (TOD). The network speaker 100B may have the ability, through the audio format converter 121 or other circuitry, to support headphones.

FIG. 4 depicts another network speaker embodiment 100C with wireless remote control access. All components of speaker embodiment 100B are present in speaker embodiment 100C and bear the same reference numerals. In addition, additional components provide wireless remote control from IR and RF remotes. It should be noted that the additional components could have been added to the network speaker embodiment 100A as well. An internal IR sensor 131 senses IR from one or more external IR remotes 170. The sensed IR is sent to an IR receiver 130 that processes the IR input, and the processed IR input is sent to the network controller 120 which then performs commands as per its application software. If desired, the IR sensor 131 may be external to the speaker 100C, which can then be installed behind a wall as a wall speaker and still receive IR. The network controller 120 can send the processed IR commands out the network interface 110 onto the network to be processed remotely by the Legacy Converter/Controller 600 which then translates them into commands to the legacy sources 500. Alternatively, the network controller 120 can send the processed IR commands out the network interface 110 onto the network to be processed remotely by the legacy Converter/Controller 600 which then translates them into Legacy home control network 700 commands. In the same manner, RF control access is provided by a RF Sensor/Transceiver 135 which receives input from RF remotes 175 and other network speaker transceivers, and transmits information to the network controller 120. While this embodiment 100C shows both IR and RF access through the same network speaker, it will be appreciated that IR only control access or RF only control access could be implemented.

The wireless control access allows IR or RF input to the speakers 100C to be used to remotely control the system 20 including control of the audio (including multi destination sync), video, HVAC, security, room light level house scenes, etc., if the system is so programmed. Where the software application includes the ability to “learn” new IR commands and associate them with audio or house control commands, existing legacy sources with IR remotes can be integrated into the network controller through the legacy Converter/Controller 600. And because the legacy Converter/Controller 600 is upgradeable over the network, the network speaker IR input ability could be made to control currently unknown system devices in the future.
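
The “learn” capability described above amounts to building an association between received IR codes and system commands. The following is a minimal sketch of such an association table; the class name, the IR code value and the command strings are all hypothetical, and the disclosure does not prescribe any particular data structure.

```python
class IrCommandMap:
    """Hypothetical table associating learned IR codes with system commands."""

    def __init__(self):
        self._commands = {}

    def learn(self, ir_code, command):
        """Associate a received IR code with an audio or house control command."""
        self._commands[ir_code] = command

    def translate(self, ir_code):
        """Return the learned command, or None if the code is unknown."""
        return self._commands.get(ir_code)


if __name__ == "__main__":
    table = IrCommandMap()
    # Both the IR code and the command name are made-up illustration values.
    table.learn(0x20DF10EF, "audio.power_toggle")
    print(table.translate(0x20DF10EF))
    print(table.translate(0x00000001))
```

Because the table is ordinary data, it could be replaced or extended when the application software is updated over the network.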

FIG. 5 shows another network speaker embodiment 100D that serves as a bridge between one or more wireless network devices and a wired segment of the network 200, known as a wireless access point. This wireless access point embodiment includes the components of embodiment 100B with additional components added for wireless-wired bridging, such as dual mode ad-hoc to infrastructure mode. The network 200 consists of at least one physically wired network section 240 and at least one wireless network segment 250. The network interface 110 consists of two parts, a wired network interface 111 connecting the network speaker 100D to the wired network backbone 240 and an RF network interface 112 connecting the network speaker 100D to the wireless RF network backbone 250. Network communication can pass between the wired backbone 240 and the wireless RF network backbone 250 via the network speaker 100D. The RF network interface 112 consists of an RF receiver/transmitter capable of both receiving and sending RF network communication.

FIG. 6 illustrates another speaker embodiment 100E that has wireless control access and that serves as a wireless access point. This wireless access point embodiment includes all of the components of embodiments 100B, 100C and 100D.

If a home has a network speaker type system, the application software opens all kinds of possibilities. New sources or new source content may enable these intelligent speakers 100 to have more features and playback formats that are not in existence today, and to adjust to the source content. An example of this would be the ability to change audio settings based on the type of music that is being played, or even the user playing it, or Time of Day (TOD). This will be highly customizable long past the time of installation, to keep the audio system upgradeable without structural changes to the home even if the network speakers are embedded in walls and other not easily accessed locations. In addition, a network speaker 100 with a microphone and the appropriate application software could record and route messages digitally to any house network node or internet node; locate and identify a user in a room, which in turn enables the system 20 to route voice mail and messages to the room the user is presently in on demand; serve as a voice recognition and authorization point to enable direct voice control of any node on the network or any legacy audio source 500 or legacy control network 700 device that may be connected to the network 200 through a legacy converter/controller 600; or automatically record and/or route voice messages from one user to the room in which the recipient identified in the voice message is currently located. Multiple network speakers 100 with microphones in one room could even triangulate the location of the user, which in turn enables the system to optimize the audio for the user's current location.

A network speaker 100 with sufficient memory and the appropriate application software could store voice mail to be played on demand by the room user or, in a totally wireless network 200, serve as a wireless repeater within a home if the wireless communication medium signal strength was insufficient to reach all rooms or areas of the home from all locations. Also, a strategically placed network speaker 100 serving as a wireless access point allows the communication of audio, data, commands or any other communications from mobile network nodes whenever they are within communication range, such as PDAs, mobile controllers, mobile computers, wireless headphones, or network speakers 100 in mobile units such as automobiles.

A network speaker 100 with IR or RF receivers and the appropriate application software would allow wireless remote control, status and feedback from any IR or RF remote, or other network speaker transceiver, to any node on the network or any legacy audio source 500 or legacy control network 700 device that may be connected to the network 200 through a legacy converter/controller 600. A network speaker 100 with an RF transceiver capable of transmitting RF could enable wireless non-networked headphones. Also, a network speaker 100 could encode and transmit sound and images from a room out on the network, as well as act as the source point for room control and automation and voice recognition services for control and automation. In addition, a network speaker 100 could participate in a multi speaker session during which each network speaker 100 could perform in a master or slave mode. A network speaker 100 in the master mode would originate, calculate, control and distribute the multi session clocks. The network speaker 100 in the slave mode would receive clocking information from the master via TCP/IP and/or RF in a multi session mode.
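
The disclosure does not state how a slave speaker would align itself to the master clock. One common approach, shown here purely as an assumed illustration, is the four-timestamp exchange used by NTP-style protocols: the slave records when it sends a request and when the reply arrives, the master records when it receives the request and when it replies, and the slave estimates its clock offset and the round-trip delay from those four values. The function name and the timestamp values are hypothetical.

```python
def estimate_offset(t0, t1, t2, t3):
    """Estimate master-minus-slave clock offset and round-trip delay.

    t0: slave clock when the request is sent
    t1: master clock when the request is received
    t2: master clock when the reply is sent
    t3: slave clock when the reply is received
    Assumes the network delay is the same in both directions.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


if __name__ == "__main__":
    # Made-up exchange: master runs 5 s ahead, 10 ms each way, 10 ms processing.
    offset, delay = estimate_offset(100.00, 105.01, 105.02, 100.03)
    print("offset (s):", round(offset, 6), "round-trip delay (s):", round(delay, 6))
```

A slave could add the estimated offset to its local clock before scheduling playback, so that all speakers in the session start a given sample at the same instant.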

A network speaker 100 additionally could be an audio source locally within the room via internal solid-state memory, as well as via terrestrial reception if components were added to receive and play back digital and analog terrestrial radio frequencies (AM/FM/CATV).

FIG. 7 is a block diagram that shows a baffle microphone 124 and a tuner 162 coupled to a DSP (digital signal processor) 160. This figure illustrates that DSP 160 (also described as a signal processor means elsewhere in the original disclosure) is an alternative embodiment to the use of one or more of the following components that include a separate Audio Format Converter 121, an ADC 128, a DAC 122, a Pre-Amp 123, and an Amplifier 125 as illustrated in FIGS. 1-6 and 8-10. The DSP 160 may include a real time adaptive analyzer to process information. DSP 160 may also include a speaker controller for providing intelligence to operate application protocol.

FIG. 8 depicts a legacy Audio Converter/Controller 600 embodiment, which includes many components similar to those of the Network Speaker 100. The legacy Audio Converter/Controller 600 communicates with the network backbone (Ethernet) 200 through a network interface 610 which handles the physical layer hardware protocol and may connect to one or more physical layers, wired or unwired or both. Coupled to the network interface 610 is a Network Controller 620 which provides the intelligence to run various application features of the legacy Audio Converter/Controller 600, including the higher levels of the TCP/IP protocol. The Network Controller 620 controls an audio format converter 621 which converts the legacy source audio into the desired network digital format for distribution. Digital audio from legacy sources 500 is transmitted directly to the audio format converter 621 to be re-formatted into the desired digital format. Analog audio from legacy sources 500 is fed to an analog to digital converter (“ADC”) 622, and the resultant digitized signal then goes to the audio format converter 621 to be coded into the desired digital format. The Network Controller 620 takes the properly formatted digital audio and sends it to the network 200 via the network interface 610. Also, the audio format converter 621 may consist of multiple encoders to provide multiple conversions of different legacy audio inputs simultaneously. The Legacy Converter/Controller 600 uses the analog video from the legacy source device for encoding to a packet switched digital format such as WMAPro “Corona”, and also uses the analog video inputs for power status and feedback.

The network interface 610 may vary depending on the physical medium and physical layer protocol used. Network interface cards, commonly referred to as NICs, are often used to connect a PC to a wired network, and are used in the preferred embodiment when the network backbone is some form of wired cable or fiber optics. Such a NIC provides a physical connection between the networking cable and the computer's internal bus. Different computers have different bus architectures; the most common are PCI found on 486/Pentium PCs and ISA expansion slots commonly found on 386 and older PCs. Typically NICs come in three basic varieties: 8-bit, 16-bit, and 32-bit. The larger the number of bits that can be transferred to the NIC, the faster the NIC can transfer data to the network cable. Many NIC adapters comply with Plug-n-Play specifications. On these systems, NICs are automatically configured without user intervention, while on non-Plug-n-Play systems, configuration is done manually through a setup program and/or DIP switches. Cards are available to support almost all networking standards, including the latest Fast Ethernet environment. Fast Ethernet NICs are often 10/100 capable, and will automatically set to the appropriate speed. Full duplex networking is another option, where a dedicated connection to a switch allows a NIC to operate at twice the speed. NICs with multiple terminations capable of supporting multiple physical layer protocols, or multiple types within a protocol, are preferred, so that the NICs include the transceivers used to connect nodes to the various Ethernet media. Most computers and network interface cards contain a built-in 10BASE-T or 10BASE2 transceiver, allowing them to be connected directly to Ethernet without requiring an external transceiver. Many Ethernet devices provide an AUI connector to allow the user to connect to any media type via an external transceiver. The AUI connector consists of a 15-pin D-shell type connector, female on the computer side, male on the transceiver side. Thickwire (10BASE5) cables also use transceivers to allow connections. For Fast Ethernet networks, a new interface called the MII (Media Independent Interface) was developed to offer a flexible way to support 100 Mbps connections. The MII is a popular way to connect 100BASE-FX links to copper-based Fast Ethernet devices. Wireless backbone physical layer network connections are made using RF network receivers made by companies such as Linksys, Cisco, IBM, DLINK, and others, using wireless protocols such as 802.11x, UWB, Bluetooth, and more as the network interface 610.

The network controller 620 is an embedded controller with flash memory programmed to function as a web server. Because the network controller 620 and the audio format converter 621 allow their application programming to be updated over the network, the legacy Audio Converter/Controller 600 can be made to code audio to currently unknown digital formats in the future. As in the speaker embodiments described above, the desired audio to be distributed will likely be in a coded and/or compressed format. Digital audio coding or digital audio compression is the art of minimizing storage space (or channel bandwidth) requirements for audio data. Modern perceptual audio coding protocols, synonymously called digital audio compression techniques, like MPEG Layer-III or MPEG-2 AAC, ATRAC3, G2, WMA, Ogg Vorbis, or WMAPro “Corona”, exploit the properties of the human ear (the perception of sound) to achieve a respectable size reduction with little or no perceptible loss of quality. As described above, this compression, in addition to reducing the sampling rate and the resolution of the audio samples, employs perceptual coding techniques addressing the perception of sound waves by the human ear, which remove the redundant and irrelevant parts of the sound signal. The sensitivity of the human auditory system for audio signals varies in the frequency domain, being high for frequencies between 2.5 and 5 kHz and decreasing beyond and below this frequency band. The sensitivity is represented by the Threshold in Quiet. Any tone below this threshold will not be perceived. The most important psychoacoustic fact is the masking effect of spectral sound elements in an audio signal like tones and noise. For every tone in the audio signal a masking threshold can be calculated. If another tone lies below this masking threshold, it will be masked by the louder tone and remain inaudible too. These inaudible elements of an audio signal are irrelevant for human perception and thus can be eliminated by the coder. The sound resulting after coding and decoding is different, but will be perceived more or less the same by the human ear. How closely it sounds to the original depends on how much compression has been performed.
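
The Threshold in Quiet mentioned above can be illustrated numerically. The disclosure gives no formula; the sketch below uses a widely cited closed-form approximation from the psychoacoustics literature (commonly attributed to Terhardt), which is an assumption brought in for illustration and not part of this specification. It returns an approximate threshold in dB SPL, and a coder could treat a tone below it as inaudible. The function names are hypothetical.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate threshold in quiet (dB SPL) at freq_hz.

    Uses a commonly cited approximation attributed to Terhardt; it is an
    illustrative assumption, not a formula taken from this disclosure.
    """
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def is_audible_in_quiet(level_db_spl, freq_hz):
    """True if a lone tone at this level lies above the threshold in quiet."""
    return level_db_spl > threshold_in_quiet_db(freq_hz)


if __name__ == "__main__":
    for freq in (100, 1000, 3300, 15000):
        print(freq, "Hz threshold:", round(threshold_in_quiet_db(freq), 2), "dB SPL")
    print("10 dB tone at 100 Hz audible:", is_audible_in_quiet(10.0, 100))
    print("10 dB tone at 3300 Hz audible:", is_audible_in_quiet(10.0, 3300))
```

The approximation is lowest in the region of a few kilohertz, which agrees with the statement above that the ear is most sensitive between 2.5 and 5 kHz.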

Audio compression actually consists of two parts. The first part, called coding or encoding, transforms the digital audio data that resides, say, in a WAVE file, into a highly compressed form called bitstream (or coded audio data). To play the bitstream on your soundcard, you need the second part, called decoding. Decoding takes the bitstream and reconstructs it to a WAVE file. Highest coding efficiency is achieved with algorithms exploiting signal redundancies and irrelevancies in the frequency domain based on a model of the human auditory system. Current coders use the same basic structure to produce coding that can be described as “perceptual noise shaping” or “perceptual sub-band/transform coding”. The encoder analyzes the spectral components of the audio signal by calculating a filterbank (transform) and applies a psychoacoustics model to estimate the just noticeable noise-level. In its quantization and coding stage, the encoder tries to allocate the available number of data bits in a way to meet both the bit rate and masking requirements. The decoder is much less complex. Its only task is to synthesize an audio signal out of the coded spectral components. Psychoacoustics describes the characteristics of the human auditory system on which modern audio coding technology is based. For the audio quality of a coded and decoded audio signal the quality of the psychoacoustics model used by an audio encoder is of prime importance.

The audio format converter 621 performs audio data compression and encoding of audio formats into the audio format acceptable for distribution to the end receiver on the network and can consist of an audio encoder-decoder chip (CODEC). To best perform the audio format conversion function, the CODEC should be designed to handle all types of audio content, from speech-only audio recorded with a low sampling rate to high-quality stereo music. The CODEC should be very resistant to degradation due to packet loss, and have efficient encoding algorithms to perform fast encodes and decodes, and to minimize the size of the compressed audio files, and still produce quality sound when they are decoded. Also, the CODEC should be highly scalable and provide high-quality mono or stereo audio content over a wide range of bandwidths, to allow selection of the best combination of bandwidth and sampling rate for the particular content being played or recorded. Content encoded at 192 Kbps by the CODEC should be virtually indistinguishable to a human ear from content originating on a compact disc. This extremely high-quality content is called CD transparency.
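
To put the 192 Kbps figure in context, the following short calculation compares it with the bit rate of uncompressed compact disc audio. The compact disc parameters (44.1 kHz sampling, 16 bits per sample, two channels) are standard values rather than figures stated in this disclosure, and the function names are hypothetical.

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(uncompressed_bps, encoded_bps):
    """How many times smaller the encoded stream is than the original."""
    return uncompressed_bps / encoded_bps


if __name__ == "__main__":
    cd_bps = pcm_bitrate_bps(44100, 16, 2)   # standard compact disc audio
    ratio = compression_ratio(cd_bps, 192000)
    print("CD audio bit rate (bps):", cd_bps)
    print("Reduction at 192 Kbps:", round(ratio, 2), "to 1")
```

Under these assumptions, a CD-transparent 192 Kbps stream needs roughly one seventh of the network bandwidth of the uncompressed source.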

The analog to digital converter 622, commonly referred to as an ADC, converts an analog level input to a digital output. Adding a microphone speaker input to the ADC will enable voice control of the legacy Audio Converter/Controller 600. It would also enable the legacy Audio Converter/Controller 600 to record audio input for later use as system messages or audio feedback. Depending on the software application and programming in the network controller 620, the audio input may be stored locally for future audio feedback, used locally, or it may be fed out to the network through the network interface 610. The audio input could be used with a voice recognition application to initiate spoken audio or home control commands.

The Legacy Audio Converter/Controller 600 may also communicate with the legacy sources 500 using a legacy communication method, such as IR or serial commands, that is understood by the legacy device. The planned embodiment of the invention will use the fixed set of serial commands already understood by the target legacy source. The network controller 620 controls and communicates with a legacy controller 624, which also communicates with the legacy source 500 through a legacy audio network interface 623. In a preferred embodiment of the invention, an RS-232 serial command interface will be used. The functions of the network controller 620 and the legacy controller 624 can be combined into one embedded controller.

The Legacy Audio Converter/Controller 600 may also communicate with the legacy home control network 700 using the network communication method understood and practiced by the legacy home control network 700, and such communication may vary greatly depending on the legacy home control network 700 being used. A preferred embodiment of the invention will use the CEBus powerline protocol for its communication method. The legacy controller 624 controls and communicates, via a legacy home control network interface 625, with a legacy home control network 700. The functions of the legacy controller 624 in controlling the legacy sources 500 and the legacy home control network 700 could be separated out into two separate embedded controllers, or combined with the network controller 620. If no legacy source 500 is available, the legacy audio network interface 623 and the legacy source control function of the legacy controller 624 may be eliminated. Similarly, in the absence of a legacy home control network 700, the legacy home control network interface 625 and the legacy home network control function of the legacy controller 624 may be eliminated.

As illustrated in FIG. 9, network speaker 100F can receive DC current from external regulated power supplies over existing 14-18 AWG speaker wire or can employ PoE (Power over Ethernet) technology to trickle charge the battery. Also, charge status can be provided for the battery 800.

Network speaker 100F has power applied as DC current from a rechargeable battery source 800 either located within the speaker or inserted into the speaker as a removable battery pack. This would also allow for line power status, which would perform a function specific to the application once this condition occurs.

FIG. 10 depicts another speaker embodiment 100G which also can be battery powered. In addition, the speaker 100G includes an ESM (Energy Storage Module) which improves audio performance.

Obviously, many modifications and variations of the present invention are possible in light of the above teachings. It is to be understood, therefore, that the invention can be practiced otherwise than as specifically described.

To summarize, this disclosure describes a networked audio output device in an audio video distribution system that uses a local area network. The networked audio output device includes a network speaker node; and an audio output device that transmits the analog audio signal and that couples to the network speaker node through the speaker/microphone driver. The network speaker node further includes a controller with a network interface that couples to the local area network and controls the processing of the digital audio signal, where the controller further comprises an embedded controller with memory and which is programmed to function as a web server. The network speaker node further includes a digital signal processor that couples to the controller, where the digital signal processor processes and formats the digital audio signal and the analog audio signal and converts the digital audio signal to and from the analog audio signal. And, the network speaker node further includes a speaker/microphone driver that couples to the digital signal processor and provides the external connection for the analog audio signal.

The networked audio output device further provides that the audio output device may include headphones or include one or more speakers.

Additionally, the networked audio output device further includes speaker sensors that couple to the digital signal processor and provide feedback and allow for sending control signals back to other devices in the local area network.

Other embodiments of the invention will be apparent to those skilled in the art after considering this specification or practicing the disclosed invention. The specification and examples above are exemplary only, with the true scope of the invention being indicated by the following claims.

1. A networked audio output device in an audio video distribution system that uses a local area network, comprising: a network speaker node that further comprises: a controller with a network interface that couples to the local area network and controls the processing of the digital audio signal, said controller further comprises an embedded controller with memory and which is programmed to function as a web server; a digital signal processor that couples to said controller, said digital signal processor processes and formats the digital audio signal and the analog audio signal and converts the digital audio signal to and from the analog audio signal; and a speaker/microphone driver that couples to said digital signal processor and provides the external connection for the analog audio signal; and an audio output device that transmits the analog audio signal and that couples to said network speaker node through said speaker/microphone driver.
2. The claim according to claim 1 wherein said audio output device are headphones.
3. The claim according to claim 1 wherein said audio output device further comprises one or more speakers.
4. The claim according to claim 1 further comprising speaker sensors that couple to said digital signal processor and provide feedback and allow for sending control signals back to other devices in the local area network.
5. A method to manufacture a networked audio output device in an audio video distribution system that uses a local area network, comprising: providing a network speaker node that further comprises: a controller with a network interface that couples to the local area network and controls the processing of the digital audio signal, said controller further comprises an embedded controller with memory and which is programmed to function as a web server; a digital signal processor that couples to said controller, said digital signal processor processes and formats the digital audio signal and the analog audio signal and converts the digital audio signal to and from the analog audio signal; and a speaker/microphone driver that couples to said digital signal processor and provides the external connection for the analog audio signal; and coupling an audio output device that transmits the analog audio signal and that couples to said network speaker node through said speaker/microphone driver.
6. The claim according to claim 5 wherein said audio output device are headphones.
7. The claim according to claim 5 wherein said audio output device further comprises one or more speakers.
8. The claim according to claim 5 further comprising speaker sensors that couple to said digital signal processor and provide feedback and allow for sending control signals back to other devices in the local area network.
9. A method to manufacture a networked audio output device in an audio video distribution system that uses a local area network, comprising: processing the digital audio signal to and from the local area network with a network speaker node that further comprises: a controller with a network interface that couples to the local area network and controls the processing of the digital audio signal, said controller further comprises an embedded controller with memory and which is programmed to function as a web server; a digital signal processor that couples to said controller, said digital signal processor processes and formats the digital audio signal and the analog audio signal and converts the digital audio signal to and from the analog audio signal; and a speaker/microphone driver that couples to said digital signal processor and provides the external connection for the analog audio signal; and transmitting the analog audio signal through an audio output device to said network speaker node through said speaker/microphone driver.
10. The claim according to claim 9 wherein said audio output device are headphones.
11. The claim according to claim 9 wherein said audio output device further comprises one or more speakers.
12. The claim according to claim 9 further comprising speaker sensors that couple to said digital signal processor and provide feedback and allow for sending control signals back to other devices in the local area network.